leaflet package1. Goal
This tutorial explains how to do some very basic spatial analysis with polygon and point data. The exercise focuses on counting the occurences of points within polygons and plotting the results as a heatmap using the leaflet package. To that end we’ll follow these steps:
rgdal)sp, raster)leaflet, htmltools, htmlwidgets)For a more in-depth tutorial on GIS and Spatial Analysis in R please see https://mgimond.github.io/Spatial/index.html.
2. Data sources
Both shapefiles were downloaded from the NYC Open data:
Zip Code Boundaries shapefile from https://data.cityofnewyork.us/Business/Zip-Code-Boundaries/i8iw-xf4u/dataSubway Entrances to NYC subway stations from https://data.cityofnewyork.us/Transportation/Subway-Entrances/drex-xx56/data3. Shapefile data
Before we being set your working directory:
setwd("/Users/ola/Desktop/EDAV_Robbins/community/")
Shapefile format is a popular geospatial vector data format for GIS (geographic information system) software (from https://en.wikipedia.org/wiki/Shapefile) The information is stored in several files with different extensions, some of which are mandatory:
Others, like .prj (a plain text file with the coordinate system and projection information), are optional.
rgdal is R’s interface to the popular C/C++ spatial data processing library gdal and is useful for handling shapefiles (https://cran.r-project.org/web/packages/rgdal/rgdal.pdf).
Using the function rgdal::ogrInfo we can get some info regarding the shapefiles, such as:
the 1st argument of the rgdal::ogrInfo is the name of the directory in the working directory the shapefile is in and the 2nd argument is the name of the file without the extension.
library(rgdal)
rgdal::ogrInfo("ZIP_CODE_040114", "ZIP_CODE_040114")
## Source: "ZIP_CODE_040114", layer: "ZIP_CODE_040114"
## Driver: ESRI Shapefile; number of rows: 263
## Feature type: wkbPolygon with 2 dimensions
## Extent: (913129 120020.9) - (1067494 272710.9)
## CRS: +proj=lcc +lat_1=40.66666666666666 +lat_2=41.03333333333333 +lat_0=40.16666666666666 +lon_0=-74 +x_0=300000 +y_0=0 +datum=NAD83 +units=us-ft +no_defs
## LDID: 87
## Number of fields: 12
## name type length typeName
## 1 ZIPCODE 4 5 String
## 2 BLDGZIP 4 1 String
## 3 PO_NAME 4 28 String
## 4 POPULATION 2 19 Real
## 5 AREA 2 19 Real
## 6 STATE 4 2 String
## 7 COUNTY 4 20 String
## 8 ST_FIPS 4 2 String
## 9 CTY_FIPS 4 3 String
## 10 URL 4 200 String
## 11 SHAPE_AREA 2 19 Real
## 12 SHAPE_LEN 2 19 Real
rgdal::ogrInfo("Subway_Entrances", "geo_export_f1302dae-758c-45a4-9f89-a6d045604655")
## Source: "Subway_Entrances", layer: "geo_export_f1302dae-758c-45a4-9f89-a6d045604655"
## Driver: ESRI Shapefile; number of rows: 1928
## Feature type: wkbPoint with 2 dimensions
## Extent: (-74.25283 40.51211) - (-73.75418 40.9036)
## CRS: +proj=longlat +ellps=WGS84 +no_defs
## LDID: 0
## Number of fields: 4
## name type length typeName
## 1 objectid 2 33 Real
## 2 url 4 254 String
## 3 name 4 254 String
## 4 line 4 254 String
Read shapefiles into spatial dataframe objects using rgdal::readOGR (arguments analogical to rgdal::ogrInfo).
zip_polygons <- rgdal::readOGR("ZIP_CODE_040114", "ZIP_CODE_040114")
## OGR data source with driver: ESRI Shapefile
## Source: "ZIP_CODE_040114", layer: "ZIP_CODE_040114"
## with 263 features
## It has 12 fields
subway_entrances_points <- rgdal::readOGR("Subway_Entrances", "geo_export_f1302dae-758c-45a4-9f89-a6d045604655")
## OGR data source with driver: ESRI Shapefile
## Source: "Subway_Entrances", layer: "geo_export_f1302dae-758c-45a4-9f89-a6d045604655"
## with 1928 features
## It has 4 fields
Note the type of the resulting objects (SpatialPolygonsDataFrame and SpatialPointsDataFrame):
class(zip_polygons)
## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"
class(subway_entrances_points)
## [1] "SpatialPointsDataFrame"
## attr(,"package")
## [1] "sp"
We can access the data (all but the latitude and longitude info), which is stored in the data slot using @data:
head(zip_polygons@data)
## ZIPCODE BLDGZIP PO_NAME POPULATION AREA STATE COUNTY ST_FIPS
## 0 11436 0 Jamaica 18681 22699295 NY Queens 36
## 1 11213 0 Brooklyn 62426 29631004 NY Kings 36
## 2 11212 0 Brooklyn 83866 41972104 NY Kings 36
## 3 11225 0 Brooklyn 56527 23698630 NY Kings 36
## 4 11218 0 Brooklyn 72280 36868799 NY Kings 36
## 5 11226 0 Brooklyn 106132 39408598 NY Kings 36
## CTY_FIPS URL SHAPE_AREA SHAPE_LEN
## 0 081 http://www.usps.com/ 0 0
## 1 047 http://www.usps.com/ 0 0
## 2 047 http://www.usps.com/ 0 0
## 3 047 http://www.usps.com/ 0 0
## 4 047 http://www.usps.com/ 0 0
## 5 047 http://www.usps.com/ 0 0
head(subway_entrances_points@data)
## objectid url
## 1 1734 http://web.mta.info/nyct/service/
## 2 1735 http://web.mta.info/nyct/service/
## 3 1736 http://web.mta.info/nyct/service/
## 4 1737 http://web.mta.info/nyct/service/
## 5 1738 http://web.mta.info/nyct/service/
## 6 1739 http://web.mta.info/nyct/service/
## name line
## 1 Birchall Ave & Sagamore St at NW corner 2-5
## 2 Birchall Ave & Sagamore St at NE corner 2-5
## 3 Morris Park Ave & 180th St at NW corner 2-5
## 4 Morris Park Ave & 180th St at NW corner 2-5
## 5 Boston Rd & 178th St at SW corner 2-5
## 6 Boston Rd & E Tremont Ave at NW corner 2-5
Latitude and longitude data for each polygon is stored in polygons slot, so, for istance, we can access the latitude and longitude data of the 1st polygon using:
head(zip_polygons@polygons[[1]]@Polygons[[1]]@coords)
## [,1] [,2]
## [1,] 1038098 188138.4
## [2,] 1038142 188154.8
## [3,] 1038171 188165.8
## [4,] 1038280 188206.7
## [5,] 1038521 188296.8
## [6,] 1038764 188388.4
4. Projection
Note the extent of the latitude and longitued data for polygons as outputted by rgdal::ogrInfo: ## Extent: (913129 120020.9) - (1067494 272710.9). This doesn’t look like the latitude and longitude of NYC we are used to (latitude ~40N or +40, longitude ~74W or -74). In contrast, the extent for the points file is more what we expect: ## Extent: (-74.25283 40.51211) - (-73.75418 40.9036).
This is because the projection of the polygon shapefile is different - it is not the the usual WGS (World Geodetic System) projection, as it is in the case of the points data. To be able to join the two files spatially and also to plot them using the leaflet package we will tarnsform the projection of the polygon data using sp::spTransform.
sp is R’s Classes and Methods for Spatial Data (https://cran.r-project.org/web/packages/sp/sp.pdf). sp::spTransform takes the spatial dataframe object to be transformed as the 1st argument and the target coordinate reference system object as the 2nd argument (using the class CRS). We project the polygon file to the target CRS equal to that of the points file, which we can obtain using rgdal::ogrInfo: ## CRS: +proj=longlat +ellps=WGS84 +no_defs.
library(sp)
zip_polygons_projected <- sp::spTransform(zip_polygons, CRS("+proj=longlat +ellps=WGS84 +no_defs"))
head(zip_polygons_projected@polygons[[1]]@Polygons[[1]]@coords)
## [,1] [,2]
## [1,] -73.80585 40.68291
## [2,] -73.80569 40.68295
## [3,] -73.80558 40.68298
## [4,] -73.80519 40.68310
## [5,] -73.80432 40.68334
## [6,] -73.80345 40.68359
Note that projection and the extent of the projected file is the same as the points file. We can access this info directly using functions from package raster, which is another R’s geographic data analysis and modeling package (https://cran.r-project.org/web/packages/raster/raster.pdf).
library(raster)
raster::projection(zip_polygons_projected)
## [1] "+proj=longlat +ellps=WGS84 +no_defs"
raster::projection(subway_entrances_points)
## [1] "+proj=longlat +ellps=WGS84 +no_defs"
raster::extent(zip_polygons_projected)
## class : Extent
## xmin : -74.25576
## xmax : -73.6996
## ymin : 40.49584
## ymax : 40.91517
raster::extent(subway_entrances_points)
## class : Extent
## xmin : -74.25283
## xmax : -73.75418
## ymin : 40.51211
## ymax : 40.9036
5. Visualize shapes
Let’s quickly build a default leaflet map with just two layers (addPolygons for polygons and addCircles for points) to visualize the shapes and make sure thet plot in the same geographical area:
library(leaflet)
initial_map <- leaflet() %>%
# polygons
leaflet::addPolygons(data = zip_polygons_projected) %>%
# points
leaflet::addCircles(data = subway_entrances_points)
initial_map
In the map above we can see that the polygons form the familiar shape of the five boroughs and that the points lie within the polygons. Good, this is what we expected.
Leaflet allows for some interaction with the produced map. In the map above it is possible to zoom in and out to see more detail.
5. Spatial join
Now we use sp::over function to perform a spatial join of the points dataframe with the polygons dataframe. sp::over takes the point dataframe as the 1st argument and the polygons dataframe as the 2nd. It creates a list of all points with the polygons they fall inside. Note the number of rows of this dataframe is exactly the same as of the subway_entrances_points dataframe (1928) and the columns are exacly like in the zip_polygons_projected dataframe. This is because sp::over creates one entry for each point in the new dataframe with all features equal to the features of the polygon that points is inside.
point_in_polygon <- sp::over(subway_entrances_points, zip_polygons_projected)
class(point_in_polygon)
## [1] "data.frame"
nrow(point_in_polygon)
## [1] 1928
head(point_in_polygon)
## ZIPCODE BLDGZIP PO_NAME POPULATION AREA STATE COUNTY ST_FIPS
## 1 10462 0 Bronx 75674 53022514 NY Bronx 36
## 2 10462 0 Bronx 75674 53022514 NY Bronx 36
## 3 10460 0 Bronx 56670 35155667 NY Bronx 36
## 4 10460 0 Bronx 56670 35155667 NY Bronx 36
## 5 10460 0 Bronx 56670 35155667 NY Bronx 36
## 6 10460 0 Bronx 56670 35155667 NY Bronx 36
## CTY_FIPS URL SHAPE_AREA SHAPE_LEN
## 1 005 http://www.usps.com/ 0 0
## 2 005 http://www.usps.com/ 0 0
## 3 005 http://www.usps.com/ 0 0
## 4 005 http://www.usps.com/ 0 0
## 5 005 http://www.usps.com/ 0 0
## 6 005 http://www.usps.com/ 0 0
We turn that dataframe into a dataframe with columnns: ZIPCODE and Freq using the table function on the ZIPCODE column of point_in_polygon.
zip_point_count <- as.data.frame(table(point_in_polygon$ZIPCODE))
head(zip_point_count)
## Var1 Freq
## 1 00083 5
## 2 10001 50
## 3 10002 17
## 4 10003 28
## 5 10004 20
## 6 10005 14
Since table automatically assigns names to columns, we change the name of the column with zipcodes (Var1) so that it is the same as the name of the column in the zip_polygons_projected dataframe. This is necessary so we can merge these dataframes.
names(zip_point_count)[names(zip_point_count)=="Var1"] <- "ZIPCODE"
head(zip_point_count)
## ZIPCODE Freq
## 1 00083 5
## 2 10001 50
## 3 10002 17
## 4 10003 28
## 5 10004 20
## 6 10005 14
Perform a left-merge (all.x = TRUE) on these dataframes, effectively adding a column with the count of subway entrances to the zip_polygons_projected dataframe. Do the join on the ZIPCODE column (by = "ZIPCODE") - that’s why we had to rename it in the previous step.
zip_polygons_final <- merge(x = zip_polygons_projected, y = zip_point_count, by = "ZIPCODE", all.x = TRUE)
head(zip_polygons_final)
## ZIPCODE BLDGZIP PO_NAME POPULATION AREA STATE COUNTY ST_FIPS
## 254 11436 0 Jamaica 18681 22699295 NY Queens 36
## 176 11213 0 Brooklyn 62426 29631004 NY Kings 36
## 175 11212 0 Brooklyn 83866 41972104 NY Kings 36
## 188 11225 0 Brooklyn 56527 23698630 NY Kings 36
## 181 11218 0 Brooklyn 72280 36868799 NY Kings 36
## 189 11226 0 Brooklyn 106132 39408598 NY Kings 36
## CTY_FIPS URL SHAPE_AREA SHAPE_LEN Freq
## 254 081 http://www.usps.com/ 0 0 0
## 176 047 http://www.usps.com/ 0 0 12
## 175 047 http://www.usps.com/ 0 0 10
## 188 047 http://www.usps.com/ 0 0 14
## 181 047 http://www.usps.com/ 0 0 14
## 189 047 http://www.usps.com/ 0 0 19
The spatial join is now complete and we can start plotting the data.
It should be noted that there are other ways to perform a spatial join in R that e.g. retain the data for both joined datasets (http://www.nickeubank.com/wp-content/uploads/2015/10/RGIS2_MergingSpatialData_part1_Joins.html). However, since in this exerciswe we are interested only in the count of points within each polygon, sp::over function is sufficient.
6. Make a map
Now onto plotting our dataframes using leaflet. leaflet allows to create interactive web maps with JavaScript (https://cran.r-project.org/web/packages/leaflet/leaflet.pdf). There is plenty of tuorials already existing for leaflet, such as https://rstudio.github.io/leaflet/. Here I’m only going to get into very basic functionality that is needed to complete the exercise.
6.1 Color palette
Set palette of colors for polygons - they are to be colored by the count of subway entrances (Freq column of zip_polygons_final) from yellow (low count) to red (high count). We use colorBin function of the leaflet package because the data is numeric and discrete.
library(leaflet)
palette <- leaflet::colorBin("YlOrRd", domain = zip_polygons_final$Freq)
6.2 Labels for points and polygons
Set labels of the polygons using html markup (htmltools package). Label string will have 3 lines derived from the zip_polygons_final data frame:
library(htmltools)
poly_labels <- sprintf("<strong>ZIP: %s</strong><br/>%s<br/>Num. entrances: %s",
zip_polygons_final$ZIPCODE, zip_polygons_final$PO_NAME, zip_polygons_final$Freq) %>%
lapply(htmltools::HTML)
Set labels of the points. Label string will have 2 lines derived from the subway_entrances_points data frame:
point_labels <- sprintf("Line: %s<br/>Name: %s",
subway_entrances_points$line, subway_entrances_points$name) %>%
lapply(htmltools::HTML)
6.3 Map opbject
Create a map object using leaflet package with the following layers:
addTiles or addProviderTiles function (I used the latter)addPolygonsaddMarkers (for markers with option of labels) or addCircles for non-interactive pointsaddLegendmyNYCmap <- leaflet() %>%
# base map
leaflet::addProviderTiles(providers$CartoDB.Positron) %>%
# polygons
leaflet::addPolygons(data = zip_polygons_final,
color = "#444444",
weight = 1,
smoothFactor = 0.5,
opacity = 1.0,
fillOpacity = 0.5,
fillColor = ~palette(Freq),
# change display properties while mouse cursor hovers on the polygon
highlightOptions = highlightOptions(color = "white",
weight = 2,
bringToFront = TRUE),
label = poly_labels,
# label display options
labelOptions = labelOptions(style = list("font-weight" = "normal", padding = "3px 8px"),
textsize = "15px",
direction = "auto")) %>%
# point markers
leaflet::addMarkers(data = subway_entrances_points,
label = point_labels) %>%
# legend
leaflet::addLegend(pal = palette,
values = zip_polygons_final$Freq,
opacity = 0.7,
title = "Num. entrances",
position = "topright")
6.5 Display the map
myNYCmap
The above map features the following functionalities:
You can also save your map using htmlwidgets package.
library(htmlwidgets)
htmlwidgets::saveWidget(myNYCmap, 'myNYCmap.html', selfcontained = TRUE)
In case you are annoyed with the size and the density of the markers in the previous map, you can also plot the points using leaflet::addCircles. This option, however, doesn’t allow for labels of the points appearing on mouse-over.
myNYCmap_circles <- leaflet() %>%
# base map
leaflet::addProviderTiles(providers$CartoDB.Positron) %>%
# polygons
leaflet::addPolygons(data = zip_polygons_final,
color = "#444444",
weight = 1,
smoothFactor = 0.5,
opacity = 1.0,
fillOpacity = 0.5,
fillColor = ~palette(Freq),
# change display properties while mouse cursor hovers on the polygon
highlightOptions = highlightOptions(color = "white",
weight = 2,
bringToFront = TRUE),
label = poly_labels,
# label display options
labelOptions = labelOptions(style = list("font-weight" = "normal", padding = "3px 8px"),
textsize = "15px",
direction = "auto")) %>%
# point markers
leaflet::addCircles(data = subway_entrances_points) %>%
# legend
leaflet::addLegend(pal = palette,
values = zip_polygons_final$Freq,
opacity = 0.7,
title = "Num. entrances",
position = "topright")
htmlwidgets::saveWidget(myNYCmap_circles, 'myNYCmap_circles.html', selfcontained = TRUE)
myNYCmap_circles
References:
Data sources:
https://data.cityofnewyork.us/Business/Zip-Code-Boundaries/i8iw-xf4u/data
https://data.cityofnewyork.us/Transportation/Subway-Entrances/drex-xx56/data
Packages:
https://cran.r-project.org/web/packages/rgdal/rgdal.pdf
https://cran.r-project.org/web/packages/sp/sp.pdf
https://cran.r-project.org/web/packages/raster/raster.pdf
https://cran.r-project.org/web/packages/leaflet/leaflet.pdf
Tutorials:
http://www.nickeubank.com/wp-content/uploads/2015/10/RGIS2_MergingSpatialData_part1_Joins.html
https://rstudio.github.io/leaflet/
https://mgimond.github.io/Spatial/index.html
Other: